Frequentist model averaging for analysis of dose–response in epidemiologic studies with complex exposure uncertainty

In epidemiologic studies, association estimates of an exposure with disease outcomes are often biased when the uncertainties of exposure are ignored. Consequently, corresponding confidence intervals (CIs) will not have correct coverage. This issue is particularly problematic when exposures must be reconstructed from physical measurements, for example, for environmental or occupational radiation doses that were received by a study population for which radiation doses cannot be measured directly. To incorporate complex uncertainties in reconstructed exposures, the two-dimensional Monte Carlo (2DMC) dose estimation method has been proposed and used in various dose reconstruction efforts. The 2DMC method generates multiple exposure realizations from dosimetry models that incorporate various sources of errors to reflect the uncertainty of the dose distribution as well as the uncertainties in individual doses in the exposed population. Traditional measurement-error model approaches, typically based on using mean doses in the dose-exposure analysis, do not fully account exposure uncertainties. A recently developed statistical approach that overcomes many of these limitations by analyzing multiple exposure realizations in relation to disease risk is Bayesian model averaging (BMA). The analytic advantage of the BMA is its ability to better accommodate complex exposure uncertainty in the risk estimation, but a practical. Drawback is its significant computational complexity. In this present paper, we propose a novel frequentist model averaging (FMA) approach which has all the analytical advantages of the BMA method but is much simpler to implement and computationally faster. We show in simulations that, like BMA, FMA yields 95% confidence intervals for association parameters that close to 95% coverage rate. In simulations, the FMA has shorter length of CIs than those of another frequentist approach, the corrected information matrix (CIM) method. We illustrate the similarities in performance of BMA and FMA from a study of exposures from radioactive fallout in Kazakhstan.


Introduction
Traditional statistical methods to accommodate exposure uncertainty when estimating associations of the exposure with an outcome (e.g., disease status), such as regression calibration and the Simulation Extrapolation (SIMEX) method use a single measurement or estimate of the exposure per individual and then correct the bias in the association estimates analytically or via simulations, based on assumptions about the measurement error structure [1,2].More recently, exposure uncertainties have been characterized by generating multiple exposure realizations (referred to herein as "cohort dose vectors") for the entire exposed population, using a strategy that incorporates any sources of random and systematic uncertainties that can be characterized in estimating the exposure.Important examples that use such exposure reconstruction are studies of cancer risk, where the exposure, i.e., the radiation dose received by specific organs in individuals, is obtained from complex dosimetry systems that also consider different components and sources of exposure, and dose estimation error.Statistical methods proposed in the literature for estimating association parameters in a dose-response relationship using multiple exposure realizations include Bayesian model averaging (BMA) [3], a frequentist Corrected Information Matrix approach (CIM) [4], Monte Carlo maximum likelihood (MCML) [5], and multiple imputations for measurement-error correction (MIME) [6].The drawback of the BMA and MCML methods is their substantial computational burden.Additionally, MCML cannot be readily extended to more than one parameter in the doseresponse function since parameter estimation is based on a grid search over the parameter space, which further increases numerical and computational complexities.CIM also is computationally expensive for large datasets as it requires inverting the variance-covariance matrix of the model, and can only be applied to a limited set of statistical models for continuous and count outcomes, but currently does not accommodate binary outcomes.Moreover, CIM implicitly assumes the single cohort vector of individual mean doses obtained by averaging the individual doses across all dose realizations per person is an unbiased estimate of the true dose-an assumption that is fundamentally incompatible with the design of the 2DMC when much of the uncertainty in estimates of the true, unknown individual doses is due to numerous sources of systematic and/or shared errors in the dose estimation.
In this article we propose and study frequentist model averaging (FMA), a novel, computationally efficient method to accommodate uncertainty in the dose-response parameter estimates when analyzing the dose response relationship with multiple exposure realizations.As it is easy to implement, FMA provides a viable alternative to BMA, which, despite all its advantages, requires expertise in Bayesian computation for efficient implementation.
We apply FMA to data from a study that aimed to assess the association of exposure to radioactive fallout from nuclear testing conducted in Kazakhstan with the risk of thyroid nodules in the local population [7].We also compare FMA to CIM using simulations to examine the properties of the respective statistical estimates and their corresponding confidence intervals for different magnitudes and sources of exposure uncertainty.In the Kazakhstan data example we compare the results from FMA to previously published results obtained using BMA.
The remainder of this article is organized as follows: In Section 2 we briefly describe the strategy for assessment of exposures, its uncertainty components and the resulting multiple realizations of cohort vectors of individual doses.We then present the statistical model and dose response estimation methods, BMA and the novel FMA.In Section 3, we show the comparability of BMA and FMA using radiation dose and disease data obtained for the Kazakhstan cohort [7].In Section 4, we compare FMA with CIM in a series of simulations with increasing Accounting for shared and unshared dosimetric uncertainties in the dose-response for ultrasounddetected thyroid nodules following exposure to radioactive fallout.Rad Res.2015; 183:159-173).These data are subject to third party restrictions and while we are authorized to use them for this publication we cannot release them.Interested researchers need to apply (as the authors have) for permission to use the data from the National Cancer Institute, NIH, HHS.Contact person: Todd Gibson todd.gibson@nih.govFunding: DK was partially supported by the Biostatistics/ Epidemiology/ Research Design (BERD) component of the Center for Clinical and Translational Sciences (CCTS) for this project that is currently funded through a grant (UL1TR003167), funded by the National Center for Advancing Translational Sciences (NCATS), awarded to the University of Texas Health Science Center at Houston.RMP was partially supported by the Intramural Research Program of the National Cancer Institute.The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests:
The authors have declared that no competing interests exist.
amounts of systematic and/or shared sources of uncertainty in dose estimation.Section 5 concludes with a discussion.

Methods
Here we first discuss the importance of using multiple exposure realizations in epidemiologic association studies with complex exposure uncertainties and summarize a method for generating these realizations using a two-dimensional Monte Carlo (2DMC) algorithm [8].The 2DMC is useful when there are complex uncertainties in exposures comprised of various, and often differing, degrees of systematic (shared) and random (unshared) errors for members of an epidemiologic cohort.We use the terms "dose" and "exposure" interchangeably throughout this article to refer to radiation dose, but exposure can be derived in other contexts for chemicals or other contaminants.

Exposure assessment
Epidemiologic studies to determine the dose-response association for individuals exposed to radiation, chemical agents, or toxins, have numerous challenges, some of the most important and difficult ones arising from the uncertainty in the exposure estimates for the study participants.
Exposure assessment is a well-developed field and encompasses many methods and strategies depending on the exposing agent (e.g., radiation).Exposure assessment methods can use measurement data on concentrations of the toxin in the environment.However, typically, such information is available only on a group-average or location-average basis and rarely for individual members of a cohort.For that reason, exposure assessment in epidemiologic studies often relies on estimation of individual exposures from mathematical models using appropriate input data to characterize the exposure conditions for each study participant.Exposure assessment models vary widely in their level of complexity, ranging from simple correlation models based on observations between the amount of toxin or radiation absorbed by the body per unit of toxin or radiation in the environment, to detailed mechanistic models that attempt to describe the physical and biological processes leading to exposure and dose.In radiation epidemiology, where estimation of individual exposures and the estimation of exposurerelated uncertainties for each individual dose have become routine practice, most recently Monte Carlo methods have been used [8][9][10].

Shared and unshared errors and generation of multiple exposure realizations using the two-Dimensional Monte Carlo (2DMC) approach
In mathematical models used to reconstruct individual doses to study participants, some parameters have an unknown true, fixed value that is shared among all members of a defined subgroup, e.g., cohort subjects of each gender or within an age group, or residents of a specific town or village.For the members of such a subgroup, this lack of knowledge of the true value is a systematic and/or shared source of uncertainty equally affecting all estimates of dose for those persons.In contrast, some parameters can be treated as varying independently among members of a subgroup and/or between subgroups and are a source of unshared error.Parameter values that are specific to an individual and independent from values specific to other persons are sources of unshared errors.
The importance of shared errors is that they are a source of systematic error among the subgroup sharing the estimated but uncertain parameter value.Systematic errors are a primary contributor to bias in the estimate of the dose response in epidemiological studies.The two-Dimensional Monte Carlo (2DMC) approach was developed to derive numerous possible versions of the unknown true exposure by simulating alternative values for systematic errors as well as unshared errors [8].Each set of study doses, termed "realization" or "cohort dose vectors", contains a dose estimate for each individual in the study and is based on a specific selection of shared and unshared parameter values.
Shared parameters are sometimes characterized by discrete distributions, whereby each randomly selected value can lead to dramatically different numerical estimates of the exposure.Because the probability distributions from which shared parameter values are selected cannot be guaranteed to reflect the true dose generating processes, the average value of the generated dose realizations must not be interpreted as an unbiased estimate of the mean value of the exposures to the study participants.
To characterize the many possible alternative combinations of shared and unshared errors, hundreds to thousands of alternative realizations of cohort dose vectors are generally created for dose-response analyses.These multiple realizations are created independently of one another and could, in theory, be developed from different dose-related data or even from different exposure assessment models.There are no limits on the types of data or on the combinations of exposure assessment models that might be used to generate multiple realizations for a dose-response analysis.The purpose of using different dose assessment algorithms to generate multiple realizations is to provide numerous alternatives of possibly true conditions of shared and unshared errors in dose estimation.
The 2DMC approach has been used in several cohort-based dose reconstructions in radiation epidemiology, including fallout exposures from nuclear testing in Kazakhstan [7], occupational radiation exposure among U.S. medical radiation technologists [11] and pediatric patients exposed to computerized tomographic imaging (CT) [12].In contrast, there is a very simplistic strategy for simulating multiple realizations of cohort dose vectors known as the SUMA ("Shared and Unshared, Multiplicative and Additive") model [9] that we describe in more detail in Section 4 and use to test FMA against CIM.

Statistical models for the association of exposure with disease risk
The main goal of dose-response analysis is to estimate the association parameter that relates exposure to risk of a disease outcome, and the corresponding 95% confidence or credibility interval (CI), accounting for various levels of uncertainty in the exposure estimation.For illustration, we use radiation absorbed dose expressed in units of Gray (Gy) to represent exposure.
Statistical models for risk estimation due to radiation exposure are described in the BEIR VII Report [13].In cohort studies of cancer, the quantity of interest typically is the incidence rate, modeled as a function of radiation dose, age at exposure, gender, and other covariates [14].Incidence is often estimated from grouped survival outcome data using Poisson regression models.Two popular models for the incidence rate as a function of radiation dose D are the 1.Excess Relative Risk (ERR) model: λ 0 [1 + ERR(D)], and the 2. Excess Additive Risk (EAR) model: where λ 0 denotes the baseline incidence rate at zero dose, and ERR(D) and EAR(D) the additional risk conveyed by exposure.Both ERR(D) and EAR(D) can be modeled using many different functional forms, including, but not limited to, linear (β � dose) and linear-quadratic (β � dose + γ � dose 2 ) functions.Individual level (ungrouped) survival outcome data can be analyzed using proportional hazards regression models, similar to the standard Cox regression model.Ignoring time to event information and only using binary outcomes, some authors used logistic regression models to analyze cohort studies [7,15,16].In a logistic model the excess odds ratio (EOR) is used instead of the excess relative risk (ERR).In this article we focus on ERR and EOR modeling but extensions to the EAR model are straight forward.

Model averaging methods: BMA and FMA
Model averaging methods have not been widely used in dose-response analysis applications although there are numerous statistical publications describing their form and usage, e.g.[17][18][19][20].Implicit in all model-averaging strategies is weighting various dose-response models by their goodness of fit to the observed data, i.e., their plausibility in explaining the relationship.Model averaging approaches thus are very well suited to analyzing multiple realizations of possible cohort doses from the 2DMC method.
For both BMA and FMA, first a probability model that relates disease status to exposure is specified.In such a model we let D denote exposure, C denote confounding variables and M possible effect modifying variables.For a binary disease outcome Y, with Y = 1 for event and Y = 0 otherwise, the probability model is typically logistic, where p i = P(Y i = 1), and β is the EOR per unit exposure, D. Letting Θ = (α 0 , α 1 , β, η), the corresponding likelihood function is For individual-level survival data (T,δ), where T denotes the event time and δ the censoring indicator, that is one if the event occurs and zero otherwise, a proportional hazards model can be written as where h 0 (t) denotes the baseline hazard function and β is the ERR per unit of exposure, D. The parameters Θ can be estimated from the partial likelihood function, where Λ denotes the set of event times, R(t) denotes the set of individuals still at risk in the cohort at time t.
For survival outcomes grouped into strata defined by multiple variables, the number of events in stratum i, Y i , has a Poisson distribution, where PY i denotes the person-years for stratum i.The corresponding log-likelihood function is However, in studies with reconstructed dose, instead of the true dose D, which is unobserved, we have K realizations X k , k = 1,. ..,K, of the cohort dose vector (e.g.K = 5,000), that were generated by the 2DMC algorithm described in the previous Section.Thus the data available for the i th individual in the cohort are (Y i , PY i , X i1 , . . .,X iK , C i , M i ).In all models described here, (1), (3), and ( 5), the quantity of interest is the dose-response parameter, β.
We next briefly present two methods for incorporating multiple dose realizations into parameter estimation of the dose-response model and uncertainty quantification, the previously published BMA and the novel FMA methods.

Bayesian model averaging (BMA)
We illustrate the BMA using the logistic model in (1) and omit effect modifying variables, M, for simplicity.The BMA method, described in detail in [3] that we employ in this paper, obtains the posterior probability of the ERR dose-response parameter β in Eq (1) based on using each dose realization in the likelihood (2) weighted by the goodness-of-fit of the corresponding model to the vector of disease outcomes among the individuals in the cohort.As a measure of uncertainty, a credibility interval based on the posterior distribution for all doseresponse parameters is computed.In brief, in addition to the parameters Θ = (α 0 , α 1 , β ) in model ( 1), we introduce the dose index parameter, γ, where γ = i corresponds to the i th dose realization X i being used in the likelihood (2).Thus, the joint posterior distribution of all parameters is given by Underlined letters, Y ; X , and C denote data vectors for the whole cohort.Weakly informative prior distributions p(α 0 ),p(α 1 ) and p(β) are used, for example normal distributions with large variances such as N(0, 1,000), and the prior distribution for γ�{1,. ..,K} is likewise a only weakly informative multinomial distribution, multinomial ðp Þ, where K denotes the number of individual dose realizations, and the hyperprior for γ, p ¼ p 1 ; . . .; p K ð Þ is Dirichlet(1, . . .,1).For these choices of priors and hyperpriors every dose realization has a priori equal weight 1/K and thus every realization is equally likely to give the best fit to the data.Using weakly informative or non-informative priors in the Bayesian analysis reduces the impact of the prior on the inference and thus maximizes the influence of data.The weakly informative prior for the dose index parameter, γ, reflects our lack of knowledge about which dose realization is closest to the true unknown dose, and sets all dose realization to be equally likely to be closest to the true dose.As γ is a categorical variable, its prior is usually a Dirichlet prior with parameters all equal to one.The posterior distribution for the parameters can be obtained from a Markov Chain Monte Carlo (MCMC) algorithm using the Metropolis-Hastings (MH) method [21].
We also compute a Bayesian weight (BW k ), namely the relative selection frequency for a specific realization of a cohort dose vector X k , is the marginal posterior distribution of the k th cohort dose realization.BW k in Eq (8) is a relative measure of the goodness-of-fit of the k th cohort dose vector realization X k to the data, with larger values corresponding to a better fit of X k to the observed disease outcomes.The Bayesian credibility interval for the BMA estimate of β is obtained from the highest posterior density (HPD) interval given a specific credible level (e.g., 95%).Unlike the currently available version of the CIM approach, BMA can handle any dose-response model, regardless of whether the disease outcomes are expressed as continuous, binary, count, or survival data.For further details about BMA see [3].

Frequentist model averaging (FMA)
A drawback of the BMA approach is that it is extremely expensive computationally with large dataset.In implementing the BMA, the whole dataset has to be read into memory and a large model space needs to be evaluated to obtain posterior model probabilities.For example, it takes several days on a personal computer with a multicore processor and 24GB memory to get the BMA result for the example data in this paper using traditional Bayesian computational softwares such as WinBUGS [22] and JAGS [23].When we have complex uncertainties, conventional Markov chain Monte Carlo (MCMC) methods such as Gibbs sampling and the Metropolis-Hastings algorithm face the 'local trap' problem due to many local optima of the posterior distribution of β.Advanced Bayesian computation techniques are then needed to get a proper posterior distribution of β, which requires special programming expertise to perform the analysis.To avoid the computational burden of BMA, we, therefore, propose a new frequentist model averaging (FMA) approach for estimating the ERR(D), β, and its corresponding 95% confidence interval.
Before describing our specific FMA approach, we provide a general description of the FMA idea, see also [24].In the FMA approach, distinct models fit to the data are averaged."Different models" could refer to different parametric forms or models including different sets of independent variables.For example, assume that there are three independent variables (X 1 , X 2 , X 3 ) available for inclusion in a linear regression model, leading to seven different possible models (Model 1 (M 1 ): X 1 is only included; Model 2 (M 2 ): X 2 is only included; Model 3 (M 3 ): X 3 is only included; Model 4 (M 4 ): X 1 and X 2 are included; Model 5 (M 5 ): X 1 and X 3 are included; Model 6 (M 6 ): X 2 and X 3 are included; and Model 7 (M 7 ): X 1 , X 2 , and X 3 are included) assuming at least one independent variable is significant.The FMA estimates of the regression coefficients are then obtained as the weighted average of the coefficient estimated coefficients of the different models, using model weights (MW) based on a goodness-of-fit (GOF) criterion [25], e.g. the Akaike Information Criterion (AIC), defined as AIC ¼ 2p À 2 lnð LÞ, where p denotes the number of estimated parameters in the model and L denotes the maximum value of the model's likelihood function], where . . .; 7 and βM k denotes the coefficient estimator of Model k.FMA for generalized linear models (GLMs) and generalized linear mixed models (GLMMs) were discussed, for example, in [26,27] and FMA for high-dimensional data was considered in [28,29].
Here we present FMA in a special setting, since all models have the exact same parametric form (i.e., the same parametric regression model with the same number of parameters) unlike the example described above.The difference between the models is the specific exposure vector that is used as the independent variable.The corresponding model weights MW k can be viewed as corresponding to the posterior probabilities of the parameters γ = k, k = 1,. ..,K in BMA.However, in contrast to the BMA method that simultaneously evaluates all dose realizations, our FMA strategy uses a two-stage approach.In the first stage, each realization of a cohort dose vector, X k , k = 1,..,K, along with the vector of disease outcomes and associated confounding variables is evaluated separately using the same parametric model form.The resulting K models can be easily fit in parallel runs, unlike BMA which evaluates all dose vectors jointly and for which computations cannot be parallelized.As FMA allows for parallel implementations there are huge savings in computational time when the sample size, N, and the number of exposure realizations, K, are large.To run BMA, the whole dataset has to be read into computer memory space, which typically prevents BMA analyses on standard single computers with large datasets.Each model yields an estimate of the ERR(D), bk , the corresponding variance, var bk

� �
, and GOF measure.Here we use the AIC [24] to compute a frequentist weight, In

Simulations and data example to illustrate FMA and its comparability with BMA
We use data from the study by Kwon et al. [3] to further illustrate the FMA approach and compare it to results obtained from the BMA method.The study cohort consists of 2,376 individuals under the age of 21 years between August 1949 and September 1962, who lived in villages in the downwind area of the Semipalatinsk Nuclear Test Site located in northeast Kazakhstan.Radiation dose estimates for individuals were based on standard dose reconstruction models (NCRP 2009) while the multiple realizations were derived using the 2DMC approach [8] supplemented with probability distributions describing interview data on exposure-related lifestyle habits [30].Using the 2DMC, 5,000 distinct radiation exposure realization vectors were generated for this analysis.The aim of the analysis was to estimate the association of reconstructed fallout-related internal and external radiation doses to the thyroid gland and the prevalence of thyroid nodules (absent/present) using a logistic regression model.We analyzed the binary outcome Y = 0 or 1 (thyroid nodules absent or present) with p i = P(Y i = 1) modeled using the logistic model where β is the excess odds ratio per Gy (EOR).

Comparing FMA and BMA using simulated settings
First, we compared BMA and FMA using the 360 settings (3 different β values × 2 uncertainty scenarios × 60 test disease outcome vectors) examined in the simulation study in [3].In brief, we simulated multiple alternative realizations of a "true" set of disease status variables (i.e., the disease status for the 2,376 individuals in the cohort).Each simulated set of disease status variables was produced using one of the 5,000 vectors of doses for the cohort (subsequently removed from the multiple realizations, leaving 4,999 for testing), one of four pre-specified values of a "true" slope, β, (EOR/Gy = 0, 3, 12, and 20), and covariates that affect the baseline risk of thyroid nodules (i.e., age at time of screening and sex).We used the following operating characteristics to assess performance of the methods: the coverage rate of the 95% confidence intervals (CR), the left error rate (LER), the right error rate (RER), the length of the 95% confidence interval (LCI), the absolute relative bias of EOR (ARB), and the relative upper limit (RUL).CR, RER, and LER were defined as the frequency (percentage of times) the true value of parameter was inside the CI, below the lower limit, and above the upper limit, respectively.
RUL was calculated as the ratio of the upper limit of the 95% CI and bFMA bBMA ).The performance measures for FMA and BMA are shown in Table 1.
In settings with small systematic uncertainties, as was the case for doses only describing external exposure to fallout deposited on the ground surface, both BMA and FMA yielded almost identical results in terms of CR, ARB, LCI and RUL (Table 1).In settings with substantially larger uncertainty, FMA showed slightly higher coverage rates for the two larger test values (EOR = 12 and 20).These higher coverage rates of FMA are due to the longer length of the FMA 95% CIs.However, for EOR = 3, FMA had a similar performance to that of BMA.To compare the efficiency of BMA and FMA, we present the averages of their respective coverage rates, which were very similar, about 97.1% (BMA) and 98.4% (FMA) for the setting of small uncertainty; and 95.8% (BMA) and 96.7% (FMA) for the setting of large uncertainty.In Fig 2A and 2B FMA shows quite comparable weighted mean estimates of EOR as BMA and the respective confidence/credible intervals are also similar.Comparing FMA and BMA using the observed cohort data Table 2 shows results obtained from applying the risk evaluation strategies from two models described in [7] using the same 2DMC dataset for uncertain doses and the actual disease vector of cohort members with and without thyroid nodules.In this evaluation of the unknown slope of a linear dose response relationship, Model 1 (external and internal dose) describes the (EOR) separately in men and women and Model 2 (external dose only) describes the slope between external and internal exposures for both sexes combined.
In the actual epidemiological investigation [7], unlike in the simulated situations in Table 1, the true value of the EOR(D) was unknown.The central estimates of EOR(D) from FMA were slightly smaller than those from BMA.The upper limits of the 95% CIs from both BMA and FMA were similar.However, unlike BMA, the lower limits from FMA were near zero.The discrepancy in the lower limits arises because BMA incorporates a non-negativity constraint for the EOR estimate by using the exponential distribution as a prior distribution of β.

Simulation study comparing FMA with CIM
We further compared the performance of the FMA method with the Corrected Information Matrix approach (CIM) [4], another frequentist approach to accommodate exposure uncertainty in the estimation of CIs.The CIM approach is described in detail in the Appendix.As currently the CIM only accommodates linear and Poisson regression models, we could not include it in the comparisons based on the Kazakhstan cohort.

Data generation
Here, we simulated cohort disease vectors from a Poisson regression model, i.e., we assumed that the count outcome, Y i , for stratum i had a Poisson distribution shown in Eq (3).We used scalar confounding and effect modification variables here and denote the vector of all parameters by Θ = (α 0 , α 1 , β, η).For a cohort of N individuals the true disease generating data are (Y i , PY i , D i , X i , C i, M i ), i = 1,. ..,N.Letting Y ; PY � ; C ; M ; and D denote the vectors of count outcomes, person-years, confounder, effect modifier, and exposure, respectively, for the whole cohort, the corresponding log-likelihood is shown in the Eq (5).To efficiently generate exposure vectors, we used the shared uncertainty multiplicative/ additive (SUMA) error model [9] as a simplistic replacement for a complex dosimetry system.SUMA assumes that the distribution of the true dose, X i , has four error components, a shared multiplicative error (ε SM ), an unshared multiplicative error (ε UM,i ), a shared additive error (ε SA ), and an unshared additive error (ε UA,i ).The multiplicative errors are assumed to be lognormally distributed and additive errors are assumed to be normally distributed.The dose X i , given the errors and the mean dose, Z i , is defined as where where S, U, M, and A indicate shared, unshared, multiplicative, and additive errors, respectively.In our simulations we only considered shared and unshared multiplicative errors in (8) and generated X i = ε SM ε UM Z i We thus used only two sources of errors that were simulated using two multiplicative parameters, one with uncertainty shared by all cohort members, and the other varied at random for individuals in the cohort.Under this simplification SUMA generates uncertainties corresponding to Berkson type errors, i.e. the true values vary about the mean at random [1].We let and examined the impact of different magnitudes of exposure uncertainty on interval estimation using four scenarios with different values for s 2 SM and s 2 UM , summarized in Table 3.The first scenario uses the same values as used in [4], and the other scenarios had twice and three times the magnitude of the shared multiplicative error.
For each scenario we first generated 5,000 realizations of a possibly true cohort dose vector with sample size of N = 1,000 individuals in the cohort.Fig 3 illustrates the degree of uncertainty for each scenario by plotting the cumulative distribution function (CDF) of the study doses.The thick black line represents the single vector of individual mean doses, i.e., the vector of individual doses averaged per each person across all dose realizations.Each thin gray line represents one of the 5,000 realizations of the cohort dose vector per test scenario.The width of the gray area in the plots represents the degree of shared uncertainty in the dose distributions, where a wider gray area corresponds to a larger shared uncertainty component.As can be seen, each scenario represents different degrees of shared uncertainty.
In this exercise, one of the generated dose vectors was assumed to be the "true dose" to generate the disease outcome vectors Y from the Poisson regression model described in Table 4 with true values of β = 0 and of β = 3 for the dose-response regression coefficient.Age was generated using a uniform distribution with support 20 to 50, and gender was generated using a Bernoulli distribution with p = 0.5 to obtain equal proportions of men and women.The person-years for a stratum were generated from an exponential distribution with rate = 0.5.In the model fitting step, we excluded the specific cohort dose vector used to generate the disease out- comes to reflect a realistic setting for analysis.Thus, for each test 4,999 realizations were evaluated for their goodness of fit to the dose response for both CIM and FMA.We summarize the simulation results using the same operating characteristics we used in Section 3. In addition to β = 3, we also evaluated the operating characteristics for the possibility of no association of dose with outcome (β = 0).

Simulation results
We first generated 5,000 different count outcome vectors for individual disease status in the cohort under the null model (β = 0) using doses generated under Scenario 1 in Table 3, and examined LER, CR, RER, absolute bias (AB), LCI, and the upper limit of 95% confidence interval (UL).As shown in Table 5, both CIM and FMA had similar operating characteristics and the coverage rates of the corresponding 95% CIs were close to the nominal rate of 95%.
Table 6 shows CR, LER, RER, LCI, ARB, and RUL when disease outcome data were generated from a Poisson model with β = 3 for doses generated under all settings in Table 4.Here  1 and 2. https://doi.org/10.1371/journal.pone.0290498.g003 both CIM and FMA also had empirical CRs close to the nominal 95% level for all scenarios.The averages of the coverage rates of FMA and CIM based CIs were about 95.7% and 96.2%, respectively.The two tail error rates of CIM were similar, but FMA had a higher LER than RER.That suggests that FMA is more likely to underestimate β, but less likely to overestimate β.CIM had slightly smaller absolute relative bias (ARB,) than FMA.
As shown in Table 6, when doses were generated under Scenario 3, CIM had an almost 31% larger average LCI compared to FMA (8.18 vs 6.22) and 60% larger RULs (3.12 vs 1.95).While CIM and FMA based CIs had similar coverage rates, around 95%, those obtained from FMA were much shorter in length with a lower upper limit.Thus the 95% CIs from FMA were more informative than those obtained from CIM when the magnitude of systematic and/or shared uncertainty was large.For large systematic and/or shared uncertainty in dose estimation, the length of the CIs for CIM increased substantially.For easy visualization in comparing CIM and FMA for each scenario, 100 tests were randomly selected from 5,000 tests and were rearranged by ERR(D) estimate of CIM (Fig 4, with CIM in left column and FMA results in the right column).Most FMA upper limits of the 95% CIs were smaller than those of CIM.In Scenario 1, 87% of the upper limits of FMA based CIs were smaller than those of the CIM based CIs.For other scenarios, around 95% of the upper limits of FMA based CI were smaller than those of the CIM based Cis.

Discussion
In this paper, we propose a computationally efficient and relatively simple novel frequentist model averaging approach (FMA) to accommodate multiple exposure realizations for the analysis of exposure-response relationships.For the linear excess odds ratio (EOR) model with a binary outcome we used in our real data analysis and the simulation study, the prior for the BMA approach for the EOR/Gy, β, was a truncated normal distribution, truncated at zero (i.e., non-negative constraint).This ensures that 1 + βD i > 0 in the function log[1 + βD i ], that is part of the EOR model.Although this non-negativity constraint for the EOR estimation in the BMA approach introduced differences in the lower limit of 95% CIs, the discrepancy of the lower bound was negligible for small or moderate estimates of EOR/Gy (see Table 2).We show that FMA provides association and uncertainty estimates comparable to those of the and PY i is person-years for i th stratum https://doi.org/10.1371/journal.pone.0290498.t004 Table 5. Performance of CIM and FMA for outcomes generated from a Poisson regression with β = 0 (CR: 95% CI coverage rate, LER: left error rate, RER: right error rate, LCI: length of 95% CIs, AB: absolute bias, UL: upper limit) for the null model (β = 0) using doses generated under Scenario 1. Results are summarized from 5,000 tests with sample size of 1,000 using different pre-selected 2DMC realizations of a cohort dose vector for each of the 5,000 tests.CR, RER, and LER were defined as the percentage of times the true value of the slope of the null dose response (β = 0) was inside the 95% CI, and below the lower limit, or above the upper limit, respectively.computationally much more burdensome BMA.In a simulation study, we compared the operating characteristics of the FMA to CIM, which also accommodates multiple dose realizations in the computation of the confidence intervals of the association estimates.We used a simple two parameter SUMA algorithm for generating exposure realizations assuming all shared and unshared uncertainties were random and multiplicative, and centered on the single cohort vector of individual mean doses, essentially implying a Berkson error model, where the cohort vector of individual mean dose is an unbiased surrogate for the true dose vector.While SUMA is easy to use for simulations, the 2DMC method makes no assumptions that the vector of mean doses per person is unbiased.Thus, these simulation results have to be interpreted with caution due to the simplified structure inherent in the two-parameter SUMA method.

Performance
As with regression calibration and conventional regression, CIM uses the single cohort vector of individual mean doses (averaged over all multiple realizations for each cohort member) for estimating the dose-response association parameter.This results in CIM estimates with a relatively small absolute bias when averaged across all tests.Different properties of the estimated confidence intervals between CIM and FMA come from their differences in aggregating the K different models.CIM does not distinguish dose vectors that yield models with a good fit to the cohort outcomes and models with poor fit in the estimation of the variance of β.In addition, CIM assumes that the single vector of individual mean doses is a good approximation to the unknown true dose.This assumption is likely unrealistic.In most reconstructions of past radiation exposure, there is a substantial lack-of-knowledge about the exposure conditions and the parameters of the dose estimation model are associated with large and complex amounts of systematic and/or shared uncertainty.
FMA weighs the estimates of β obtained for different dose vectors by their goodness of fit to the disease outcome data.If a particular model weight is very small (e.g., less than 0.001), then the corresponding model contributes minimally to the estimate of β and its confidence interval.This is reflected in FMA yielding shorter confidence intervals for the estimates of β than CIM.In Fig 5, FMA weights were displayed for selected three tests for each scenario.Among the 4,999 realizations of possibly true cohort dose vectors, only a small proportion had goodness-of-fit weights of impactful magnitude.
Previously, Kwon et al. [3] proposed a Bayesian model averaging (BMA) approach to estimate dose-response parameters while accommodating shared uncertainties in exposures.We proposed FMA as a computationally more efficient alternative to BMA that is also much easier to implement using available statistical software such as SAS, R, or EPICURE [31][32][33].For Table 6.A comparison of CIM and FMA estimates for the slope of a linear dose response (ERR/Gy) for outcomes generated from a Poisson model (CR: 95% CI coverage rate, LER: left error rate, RER: right error rate, LCI: length of 95% CIs, ARB: absolute relative bias, RUL: relative upper limit).CR, RER, and LER were defined as the percentage of times the true value of the slope of the dose response was inside the CI, below the lower limit, and above the upper limit, respectively.ARB was calculated as the absolute difference between the estimated value of the slope of a linear dose response and the pre-determined true value (β = 3) divided by the true value of the slope.FMA, given an exposure realization, any conventional dose-response analysis can be performed that yields estimates of the regression coefficients for the response function, corresponding standard errors, and goodness-of-fit measures (e.g., AIC, BIC, DIC) [28] which can then be aggregated.In our simulations based on real data we found that FMA and BMA had very similar performance characteristics.

Simulation
Another limitation of BMA is that it is computationally challenging for very large datasets.For example, for analyzing risk in the EPI-CT study, in which radiation-induced cancer risk from pediatric computed tomography (CT) scans was assessed for over 900,000 individuals [34], the use of BMA would be a daunting task and require unrealistically large computational resources, while the use of FMA appears eminently feasible.Another computational advantage of FMA is that the implementation can easily be parallelized.This is impossible for both BMA and CIM since all realizations are considered simultaneously in these two approaches.Thus, FMA provides a viable computational approach with considerable advantages for the analysis of very large datasets with multiple exposure realization.

Appendix: Corrected information matrix (CIM) method
The CIM method [4] estimates the regression coefficients for the dose-response function by using the mean dose, Z = E(X|W), in model (1) instead of X, where W denotes all available information about dose, e.g., years of residency, age at exposure and sex.For individual i in the cohort, E(X|W) is estimated by the empirical mean of X i1 ,. ..,X ik , the corresponding dose realizations.The key (untestable and very strong) assumption for the CIM approach is that the conditional mean Z = E(X|W) is an unbiased estimate of the individuals' true dose X.To accommodate exposure uncertainty in the confidence intervals for the regression coefficients when using Z instead of individual doses X, a correction term for the variance that is a function of Cov(X|Z) is used, as described here in brief.Based on a Taylor expansion of the score function S Z corresponding to the log-likelihood in (6) around the true vector of parameter values Θ, we obtain Ŷ À Y � I À 1 Z S Z , where I Z is the Fisher information.Both, I À 1 Z and S Z , are estimated using the mean dose, Z. From the Taylor expansion we also have that Varð ŶÞ � I À 1 Z Var S Z ð ÞI À 1 Z , where Var S Z ð Þ ¼ I Z þ b 2 QGCovðX j ZÞGQ 0 , and Q = [Q 1 , . . ., Q N ] with A Wald-type corrected α level confidence interval for the ERR parameter β in ( 5) is then given by b � z ð1À a=2Þ ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi where z (1-α/2) denotes the 1 -α/2 quantile of the standard normal distribution.A limitation of the CIM approach is that it can be applied only to linear and Poisson regression models, but not logistic or Cox proportional hazards models.To fit Poisson models to large datasets, CIM requires aggregating the data into bins of individuals of the same confounder, effect modifier and dose combinations to make computations feasible.
stage 2, given FW k , we then simulate n k = FW k ×M values bk ðsÞ from the normal density N bk ; var bk � � � � , for each k = 1,. ..,K and an arbitrarily large M (e.g., M = 100,000).From this setup, we obtain M samples of b.The final FMA estimate bFMA , and its corresponding 95% confidence interval are obtained as the weighted mean of the M simulated b values, and the 2.5 th and 97.5 th percentiles of their corresponding empirical distribution function.The FMA approach is summarized below in Fig 1.

Fig 1 .
Fig 1. Schema of simulation-based frequentist model averaging (FMA) approach.GOF stands for goodness-of-fit measure.AIC stands for Akaike information criterion, defined as AIC = 2p -2 In( L ), where p denotes the number of estimated parameters in the model and L denotes the maximum value of the likelihood function for the model.https://doi.org/10.1371/journal.pone.0290498.g001

Fig 2 .
Fig 2. A. Comparison of BMA (left column) and FMA (right column) for external KZ dose.The red vertical line denotes 95%CI does not include the true value.The dashed horizontal line denotes the value of true EOR Gy (= 20, 12, and 3).B. Comparison of BMA (left column) and FMA (right column) for external KZ dose.The red vertical line denotes 95%CI does not include the true value.Dashed horizontal line denotes the value of true EOR Gy (= 20, 12, and 3).https://doi.org/10.1371/journal.pone.0290498.g002

Fig 3 .
Fig 3. Cumulative distribution function plots of doses for the four scenarios with increasing uncertainty corresponding to the settings in Tables1 and 2.

Fig 4 .
Fig 4. Comparison between CIM (left column) and FMA (right column) using 100 selected tests for the four scenarios with increasing uncertainty.Red vertical line denotes 95%CI does not include the true value.Upward dashed line on the right column indicates for lower and upper limits of CIM for easy comparison.Dashed horizontal line denotes the value of true ERR/Gy (= 3).https://doi.org/10.1371/journal.pone.0290498.g004

Table 1 . Summary of comparison between BMA and FMA for 240 simulated test data sets.
Kwon et al. 2016e reproduced fromKwon et al. 2016.For each EOR/Gy, 60 tests were performed.#denotes average of absolute bias and ^denotes average of upper limits.https://doi.org/10.1371/journal.pone.0290498.t001

Table 2 . Comparison of BMA and FMA analyses for EOR/Gy for males vs. females using total dose (external + internal) and external vs. internal dose (genders com- bined).
Results are for the specific exposure conditions of the Kazakhstan cohort(Land et al.2015).

Table 3 . Variances used for data generation in the simulation study based on the SUMA model
. s 2 SM denotes the variance of the shared multiplicative error (ε SM ), and s 2 M the variance of the unshared multiplicative error (ε UM,i ).See Fig 3 for a graphical depictions of scenario realizations.

Table 4 . Model for simulating cohort disease vectors. Outcome; Statistical model Model
Count outcome Y; Poisson regression y i *Poisson(λ i × PY i ) where